1. Introduction


In conventional digital encryption, a plaintext message is first encoded using a key, for example
through permutation and substitution, to obtain a ciphertext. The ciphertext can only be decoded
if the recipient has the same key used for encoding, so that the reverse permutation and
substitution can be applied to recover the plaintext, as shown in Fig. 1.

Figure 1  Schematic of conventional digital encryption.

A good encryption code cannot rely on its design being kept secret and must be resistant to
known-plaintext attacks. For a K-bit key, breaking the algorithm by brute force requires $2^K$
attempts. The encryption algorithm must be kept simple, with a reasonably short key, to be
implemented cost-effectively in digital hardware. However, the consequence of simplicity is that a
message might become breakable in the near future, e.g. in 10 years, with advances in computing power.

Figure 2  Schematic representation of NIH secure communication.

In this project, we investigated what we call needle-in-the-haystack (NIH) secure
communication. It builds upon digital encryption. Messages (digital ciphertexts) are hidden
within randomly generated data. Both encryption and decryption are performed in the physical
layer by optical components. The schematic is shown in Figure 2. With the correct keys,
content other than the true message is effectively ignored by the intended recipient. An
eavesdropper, however, must record the entirety of the message plus dummy content and apply
extensive processing to crack the code. We will show that the attack has to be brute force in
nature and requires almost infinite storage capacity and computation time.

2.  Optical  Implementation  of  NIH 


Figure  3  Optical  Implementation  of  NIH 


Optical technology is ideally suited for implementation of NIH. Random data (the haystack) can
be generated using broadband amplified spontaneous emission (ASE) from erbium-doped fiber
amplifiers (EDFAs). A notch can be carved in the ASE spectrum using an optical filter. One can
then insert the true signal (the needle) in its place^1. The size of the haystack is the bandwidth
of the ASE, $\Delta\nu_{ASE}$, and the size of the needle is the bandwidth of the signal,
$\Delta\nu_{sig}$, as shown in Fig. 3.


Figure 4  Imperfect stitching betrays signal location in NIH.

This straightforward NIH implementation has an obvious vulnerability: the location of the
signal can be determined by observing the transmitted light in the spectral domain. An
eavesdropper can identify the signal location by looking for the frequency channel that deviates
from ASE, which is Gaussian white noise. Any inaccuracy in stitching the signal spectrum into
the notch betrays the location of the true signal channel. Figure 4 shows an example in which the
notch filter does not fit the signal spectrum shape. This method thus requires perfect stitching
and leads to impractically tight tolerances in component specifications.

^1 Pieper et al. (Proc. 29th Southeastern Symp. on System Theory, p. 261-265, 1997) embedded a
modulated signal in broadband noise.


The solution to this vulnerability turns out to be straightforward. Frequency hopping of the
notch filter and signal laser in tandem forces the eavesdropper to search for the signal. The
stitching inaccuracies will not make the signal visible provided the frequency hop interval is
short. The intended recipient follows the same frequency hop sequence and thus can ignore all
random data content.

Figure 5  Experimental setup (a) and result (b) for stitching in optical domain.


We have demonstrated that the stitching error can indeed be made small enough that it is not
observable within a certain period of time. In Fig. 5, the observation time is 20 seconds, and the
stitching appears perfect when observed within this time window. As long as the frequency hopping
rate is greater than once every 20 seconds, no stitching error is observable because it remains
below the noise level of the ASE.
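The requirement that the transmitter and the intended recipient follow the same hop sequence can be sketched in a few lines. This is only an illustration under stated assumptions: the hop sequence is derived here from a shared key with HMAC-SHA256, which is our choice and not something the experiment specifies, and the names `hop_channel` and `hop_sequence`, the example keys, and the channel count of 400 are hypothetical.

```python
import hashlib
import hmac


def hop_channel(key: bytes, hop_index: int, n_channels: int) -> int:
    """Map (key, hop index) to a channel number in [0, n_channels)."""
    counter = hop_index.to_bytes(8, "big")
    digest = hmac.new(key, counter, hashlib.sha256).digest()
    # Reduce the first 8 bytes of the digest to a channel index.
    # The modulo introduces a negligible bias for small n_channels.
    return int.from_bytes(digest[:8], "big") % n_channels


def hop_sequence(key: bytes, n_hops: int, n_channels: int) -> list:
    """Channel index for each hop; identical for anyone holding the key."""
    return [hop_channel(key, i, n_channels) for i in range(n_hops)]


# Hypothetical example: 20 hops over 400 channels.
sender = hop_sequence(b"shared secret key", 20, 400)
recipient = hop_sequence(b"shared secret key", 20, 400)
eavesdropper = hop_sequence(b"wrong guess", 20, 400)

print(sender == recipient)  # both ends tune notch filter and laser identically
```

Because both ends compute the sequence independently from the key, no hop information needs to be transmitted; a party without the key has no better option than to search all channels at every hop.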


Figure 6  Constellation diagrams of a deterministic signal (QPSK constellation) and ASE (Gaussian noise).

However, the frequency-hopping NIH method has another vulnerability. A slice of ASE has a
Gaussian distribution for both the in-phase and quadrature components, whereas a digital
signal inherently takes on a set of discrete values. The constellation diagrams of a
deterministic signal and ASE are shown in Fig. 6. An eavesdropper can exploit this difference
to distinguish the true information-bearing signal from the noise.

The information-bearing signal must therefore be disguised to appear like optical noise. The
in-phase and quadrature parts of one state of polarization (SOP) of an optical noise field are
each Gaussian noise. Optical noise contains power in both polarization states, although the
optical noise could be purposely polarized for this application. The distribution of the
in-phase part of the information-bearing signal depends on the modulation format used, but it is
typically not Gaussian. For example, if polarization-multiplexed QAM is used with M bits/symbol,
the x-polarization of the QAM signal takes on complex values $u_n$ at the symbol centers
($n = 0, 1, 2, \ldots$), defined by the information to be transmitted. The distribution
$f(\mathrm{Re}[u])$ of the real part of $u_n$ comprises delta spikes separated by $d$. For the
example of 16-level QAM, $f(\mathrm{Re}[u])$ comprises four delta spikes.

The first noise-rendering method begins by adding to $u_n$ a pseudorandom complex variable $w_n$,
whose real and imaginary parts each follow a uniform distribution of width $d$:

$$f(\mathrm{Re}[w]) = \begin{cases} 1/d, & -d/2 < \mathrm{Re}[w] < d/2 \\ 0, & |\mathrm{Re}[w]| > d/2 \end{cases}
\qquad
f(\mathrm{Im}[w]) = \begin{cases} 1/d, & -d/2 < \mathrm{Im}[w] < d/2 \\ 0, & |\mathrm{Im}[w]| > d/2 \end{cases}$$

The distribution of $w_n$ appears as a square on the complex plane. $w_n$ can be obtained from a
pseudorandom data sequence which is generated from the key. The real and imaginary parts of
$u_n + w_n$ each have a uniform distribution from $-A$ to $A$, where $A = md/2$ and $m$ is the
number of delta spikes in $f(\mathrm{Re}[u])$ (for 16-level QAM, $m = 4$ and $A = 2d$). The sum is
then transformed into a quantity having a Gaussian distribution by taking the inverse error
function of each part, and the results are used for the in-phase and quadrature components of the
electric field envelope:

$$E_x(t_n) = \sqrt{2}\,\sigma\,\mathrm{inverf}\!\left(\frac{\mathrm{Re}[u_n + w_n]}{A}\right)
+ i\,\sqrt{2}\,\sigma\,\mathrm{inverf}\!\left(\frac{\mathrm{Im}[u_n + w_n]}{A}\right)$$

$\sigma$ is the standard deviation of a component of the optical noise, to which the signal is to
be matched. We have built a simulation tool, shown schematically in Fig. 7, to verify the
effectiveness of this noise-rendering process. The simulation was carried out using VPI (for PRBS
generation, QAM mapping, and coherent optical modulation) and Matlab.
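The noise-rendering step can be illustrated with a short, self-contained sketch. It is not the VPI/Matlab simulation described above: it assumes QPSK (two levels per quadrature, so the sum is uniform over a half-range equal to $d$), draws $w_n$ from Python's `random` module rather than from a key-derived sequence, and the names `inverf`, `render`, `recover`, and `demo` are hypothetical. The inverse error function is obtained from the standard normal quantile function in the standard library.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev

_STD_NORMAL = NormalDist()


def inverf(x: float) -> float:
    """Inverse error function via the standard normal quantile function."""
    # Clamp to avoid the infinities at x = -1 and x = +1.
    x = min(max(x, -1.0 + 1e-12), 1.0 - 1e-12)
    return _STD_NORMAL.inv_cdf((x + 1.0) / 2.0) / math.sqrt(2.0)


def render(u: complex, w: complex, half_range: float, sigma: float) -> complex:
    """Map u + w (uniform on a square) to a Gaussian-distributed field sample."""
    s = u + w
    scale = math.sqrt(2.0) * sigma
    return complex(scale * inverf(s.real / half_range),
                   scale * inverf(s.imag / half_range))


def recover(e: complex, w: complex, half_range: float, sigma: float) -> complex:
    """Invert render() when w is known, as it is to the key holder."""
    scale = math.sqrt(2.0) * sigma
    s = complex(half_range * math.erf(e.real / scale),
                half_range * math.erf(e.imag / scale))
    return s - w


def demo(n_symbols: int = 20000, d: float = 2.0, sigma: float = 1.0, seed: int = 1):
    rng = random.Random(seed)
    half_range = d  # QPSK: levels at -d/2 and +d/2, so u + w spans (-d, d)
    levels = (-d / 2.0, d / 2.0)
    reals, worst_error = [], 0.0
    for _ in range(n_symbols):
        u = complex(rng.choice(levels), rng.choice(levels))
        w = complex(rng.uniform(-d / 2.0, d / 2.0), rng.uniform(-d / 2.0, d / 2.0))
        e = render(u, w, half_range, sigma)
        reals.append(e.real)
        worst_error = max(worst_error, abs(recover(e, w, half_range, sigma) - u))
    within_one_sigma = sum(abs(r) < sigma for r in reals) / n_symbols
    return fmean(reals), pstdev(reals), within_one_sigma, worst_error


mean_re, std_re, frac_1sigma, max_err = demo()
print(mean_re, std_re, frac_1sigma, max_err)
```

For a Gaussian of standard deviation $\sigma$, about 68% of samples fall within one $\sigma$ of the mean, which gives a quick check on the rendered output; the recovery error shows that a receiver holding $w_n$ can undo the transformation.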


Fig. 7  Block diagram for noise rendering.

Figure 8 presents the constellation diagrams of the QPSK signal, the complex noise source, the
sum of the two, and the Gaussian noise-like electric field of the light after noise rendering.
The distributions of the real and imaginary parts of the optical field are clearly Gaussian, as
shown in Fig. 9.

Fig. 8  Constellation diagrams of the QPSK signal $u_n$ (QPSK or 4QAM), the complex noise source $w_n$,
the sum of the two, and the Gaussian noise-like electric field of the light after noise rendering.

Fig. 9  Distribution of the real and imaginary parts of the optical field after noise rendering.


There is yet another potential vulnerability associated with frequency-hopping NIH. A slice of
ASE is uncorrelated, i.e., its autocorrelation function is a delta function, whereas a digital
signal inherently has correlation due to the limited bandwidth of the electrical and optical
components used to generate the noise-rendered signal, as shown in Fig. 10. Potentially, an
eavesdropper can exploit this difference to distinguish the true information-bearing signal from
the noise.

Figure 10  Autocorrelation and spectrum for ASE and noise-rendered QPSK.

Fortunately, this is only a false alarm. Figure 11 shows the autocorrelation function and
spectrum for the relevant signals. ASE with a notch has anti-correlation in the vicinity of the
correlation peak at the center. The autocorrelation of the noise-rendered QPSK signal
complements this anti-correlation. As a result, the transmitted signal has the same
autocorrelation as the ASE noise.

Figure 11  Autocorrelation and spectrum for ASE, ASE with notch, noise-rendered QPSK and transmitted signal.

3. NIH Security


There  are  two  levels  of  security  that  the  NIH  secure  communication  method  affords.  The  first 
level  is  offered  in  the  bounded  storage  model  (BSM).  The  other  level  is  offered  by  the 
computational  complexity. 

3.1  NIH  Security  in  the  BSM 

Because of frequency hopping and the long key used in digital encryption, the eavesdropper must
inspect an enormous number of candidate decrypted ciphertexts to find the message, and the amount
of data to be stored exceeds any reasonable storage capacity available to an eavesdropper. In
other words, the eavesdropper cannot afford even to record the ciphertext in order to decrypt it
later. The concept of presenting an eavesdropper with too much data to store is not new^2, and is
called the bounded storage model. Aumann^3 proposed publicly sending a random number sequence too
large to store, then using portions of the sequence for encrypted communications, for example a
satellite broadcast of 200 Gb/s random data used by many encrypted communication links.

For the NIH secure communication scheme described in Figure 3, the amount of information to
be stored by an eavesdropper can be obtained as follows. Assume $\Delta\nu_{ASE}$ = 32 nm = 4000 GHz.
The eavesdropper must then record at least 4000 GSa/s x 2 quadratures x 2 SOPs x 5 bits
resolution, which translates to a data rate of 80 Tb/s. For comparison, the rate at which
information is stored in the world on all media is 1.3 Tb/s, according to a U.C. Berkeley study
in 2003.
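The arithmetic behind the 80 Tb/s figure can be checked with a few lines. This is a minimal sketch using only the numbers quoted above; the function name `recording_rate_bps` and its parameter names are ours.

```python
def recording_rate_bps(bandwidth_hz: int, quadratures: int = 2,
                       polarizations: int = 2, bits_per_sample: int = 5) -> int:
    """Bits per second an eavesdropper must record to capture the full ASE band."""
    samples_per_second = bandwidth_hz  # 4000 GHz of bandwidth sampled at 4000 GSa/s
    return samples_per_second * quadratures * polarizations * bits_per_sample


rate = recording_rate_bps(4000 * 10**9)   # ASE bandwidth of 4000 GHz
world_storage_rate = 1.3e12               # 1.3 Tb/s, the figure quoted in the text

print(rate / 1e12, "Tb/s")
print(rate / world_storage_rate, "times the quoted worldwide storage rate")
```

With these inputs the required recording rate is roughly 60 times the quoted worldwide storage rate.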

Therefore  it  is  safe  to  conclude  that  the  NIH  secure  communication  approach  is  extremely  robust 
in  the  bounded  storage  model. 

3.2  Strength  of  NIH  to  Brute  Force  Attack 

The NIH optical encryption method ensures that the eavesdropper must use a brute-force attack to
decode the message. As an example, we use a weak digital encryption method, namely a stream
cipher that XORs the data with a pseudorandom bit sequence (PRBS), to find the complexity
required for a brute-force attack. We assume that the PRBS is generated from a long key of
$L_{stream}$ bits. The reason for using a stream cipher with a long key is that the eavesdropper
must have $L_{stream}$ bits correctly received in order to attempt to derive the key. Further, we
choose the number of bits in each frequency hop, $L_{hop}$, to be much shorter than $L_{stream}$,
so that $L_{stream} = M L_{hop}$. The eavesdropper must inspect very many candidate decrypted
ciphertexts to find the message. The number of possibilities is set by the ratio of random data
content to message size, so it is determined by optical component parameters and not by the
amount of digital processing.


^2 Maurer, J. Cryptol., vol. 5, p. 53-66, 1992
^3 Aumann et al., IEEE Trans. on Inf. Theory, vol. 48, no. 6, p. 1668-1680, 2002

We will determine the complexity of a brute-force attack step by step. First consider the case in
which the signal is in one frequency channel, encrypted by a stream cipher, shown in Fig. 12 (a).
Assume a linear feedback shift register of length $L_{stream}$ bits is used to generate the PRBS,
and the key is the seed of the shift register. This digital encryption uses few computations even
though the key is long. By itself, it is a weak code that is easily broken in time $T_{crack}$.
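The stream cipher described here, XOR with a PRBS produced by a linear feedback shift register seeded by the key, can be sketched as follows. The register is only 16 bits long for illustration (the text assumes a much longer register of $L_{stream}$ bits), and the tap positions, the seed value, and the names `lfsr_bits` and `xor_stream` are assumptions chosen for the example, not parameters from the project.

```python
def lfsr_bits(seed: int, n_bits: int) -> list:
    """Generate n_bits from a 16-bit Fibonacci LFSR with taps 16, 14, 13, 11."""
    if not 0 < seed < (1 << 16):
        raise ValueError("seed must be a nonzero 16-bit value")
    state = seed
    out = []
    for _ in range(n_bits):
        out.append(state & 1)  # output the least significant bit
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)
    return out


def xor_stream(bits: list, seed: int) -> list:
    """Encrypt or decrypt: XOR the data bits with the LFSR keystream."""
    keystream = lfsr_bits(seed, len(bits))
    return [b ^ k for b, k in zip(bits, keystream)]


plaintext = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
ciphertext = xor_stream(plaintext, seed=0xACE1)
decrypted = xor_stream(ciphertext, seed=0xACE1)   # same seed recovers the message
wrong_key = xor_stream(ciphertext, seed=0x1234)   # different seed does not

print(decrypted == plaintext, wrong_key == plaintext)
```

Encryption and decryption are the same operation, and each output bit costs only a few shifts and XORs, which is why this cipher is cheap in hardware even with a long key.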

Next consider the case in which the signal may appear in any one of
$N = \Delta\nu_{ASE}/\Delta\nu_{sig}$ channels, but without frequency hopping, shown in
Fig. 12 (b). The eavesdropper must inspect $N$ possibilities, and the time it takes to break the
code would be $N T_{crack}$.

Finally consider $M$ hops between $N = \Delta\nu_{ASE}/\Delta\nu_{sig}$ channels, shown in
Fig. 12 (c). To assemble the true data content, the eavesdropper must inspect $N^M$ permutations,
and the time it takes to break the code would be $N^M T_{crack}$. So the NIH encryption scheme
effectively amplifies the time taken to crack the stream cipher by a factor of $N^M$.

Now let's quantify the strength of NIH in terms of computation. Take reasonable values for a
10 Gb/s signal embedded in EDFA C-band ASE: $\Delta\nu_{ASE}$ = 32 nm = 4000 GHz,
$\Delta\nu_{sig}$ = 10 GHz, $N$ = 400, frequency hop interval = 100 ns, $L_{hop}$ = 1000,
$L_{stream}$ = 10000. This leads to $M$ = 10, and the number of permutations is
$400^{10} \approx 10^{26}$. Assuming $T_{crack}$ = 1 μs (the time for simply storing data), the
time it takes to crack the code is 3000 billion years, which is 700 times the age of the earth.
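The estimate can be reproduced numerically. The sketch below uses the parameter values quoted above and takes $T_{crack}$ = 1 μs, which is an assumption on our part: it is the value that reproduces the quoted figure of about 3000 billion years. The function name `nih_crack_time` and its argument names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def nih_crack_time(ase_bw_hz: float, sig_bw_hz: float, bit_rate: float,
                   hop_interval_s: float, l_stream: int, t_crack_s: float):
    """Return (N, L_hop, M, N**M, brute-force time in years)."""
    n_channels = round(ase_bw_hz / sig_bw_hz)       # N = bandwidth ratio
    l_hop = round(bit_rate * hop_interval_s)        # bits sent per frequency hop
    n_hops = l_stream // l_hop                      # M = L_stream / L_hop
    permutations = n_channels ** n_hops             # N^M channel sequences
    years = permutations * t_crack_s / SECONDS_PER_YEAR
    return n_channels, l_hop, n_hops, permutations, years


n, l_hop, m, perms, years = nih_crack_time(
    ase_bw_hz=4000e9, sig_bw_hz=10e9, bit_rate=10e9,
    hop_interval_s=100e-9, l_stream=10_000, t_crack_s=1e-6)

print(n, l_hop, m)
print(f"{perms:.2e} permutations, {years:.2e} years")
```

Since the exponent $M$ grows with the key length and shrinks with the hop interval, shortening the hop interval or lengthening the key raises the attack time much faster than adding channels does.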


Figure 12  Schematic illustration of permutations in NIH.

Therefore, the NIH encryption method resolves a key vulnerability of digital encryption methods,
namely that an encryption algorithm of moderate strength today may be breakable in the future.

4.  Conclusion 

In this project, we investigated a novel physical-layer secure optical communication scheme.
Needle-in-the-haystack (NIH) encryption is based on hiding information in random noise. The
signal is further rendered so that it is indistinguishable from noise unless the correct key is
available. The NIH encryption method offers two levels of security. We find that, with typical
parameters of COTS optical devices, the eavesdropper must be able to record data at a rate of at
least 80 Tb/s, and it would take 3000 billion years to crack the code.